Dynamic response headers #460

Closed
wants to merge 5 commits into from

Conversation

jimlawhorn

#360

Implements a similar concept to the two referenced implementations. An optional headers.json lets you add any response header based on the requested path, and multiple matches can be appended together. It can also be used to override command-line headers for specific paths.

Major differences

  • Headers are stored in a JSON file at the root, which is read on startup, not on request. This limits file access to a single read.
  • Implemented globbing (minimatch), which should allow for more robust pattern matching, e.g., being able to specify *.js rather than a config at a directory or specific file level.
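
A minimal sketch of the path-based matching described above, under stated assumptions: the rule shape is a guess at what a headers.json could contain (it is not taken from this PR), and `globToRegExp` is a small hypothetical stand-in for minimatch so the example runs without dependencies. It only understands `*` and `**`.

```javascript
// Hypothetical rules, as they might appear in a headers.json (assumed shape).
const rules = [
  { pattern: '**/*.js', headers: { 'Cache-Control': 'max-age=3600' } },
  { pattern: '/admin/**', headers: { 'X-Robots-Tag': 'noindex' } },
  { pattern: '/admin/*.js', headers: { 'X-Robots-Tag': 'noarchive' } },
];

// Simplified stand-in for minimatch: supports only `**` and `*`.
function globToRegExp(glob) {
  // Escape regex metacharacters other than `*`.
  const escaped = glob.replace(/[.+^${}()|[\]\\?]/g, '\\$&');
  const source = escaped
    .replace(/\*\*\//g, '\u0000')   // placeholder: `**/` = zero or more directories
    .replace(/\*\*/g, '\u0001')     // placeholder: bare `**` = anything
    .replace(/\*/g, '[^/]*')        // `*` stays within one path segment
    .replace(/\u0000/g, '(?:.*/)?')
    .replace(/\u0001/g, '.*');
  return new RegExp('^' + source + '$');
}

// Collect headers from every matching rule, in file order.
function headersFor(requestPath, ruleList) {
  const result = {};
  for (const { pattern, headers } of ruleList) {
    if (!globToRegExp(pattern).test(requestPath)) continue;
    for (const [name, value] of Object.entries(headers)) {
      // Multiple matches for the same header are appended, comma-separated.
      result[name] = name in result ? `${result[name]}, ${value}` : value;
    }
  }
  return result;
}
```

With these rules, a request for `/admin/app.js` matches all three patterns, so `X-Robots-Tag` ends up with both values appended, while `/index.html` matches none and gets no extra headers.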

@BigBlueHat
Member

I do think the top-level file is probably going to handle the most cases. However, I do like the "in place" .header approach as it avoids the need to update a "parent" file and/or restart the server when changes happen within the site. Thoughts?

@jimlawhorn
Author

I think there are pros and cons to both approaches (in-directory vs. global).

From my perspective, probably the hardest problem to solve in an in-directory model is inheritance. If a requested file is several directories deep and each directory has a .headers file, how do you determine which to apply? While most standard headers only allow for a single value, that's not necessarily true of custom headers. If we were to say that the lowest level (i.e., closest to the requested file) "wins" and headers aren't appended as you go up the chain, it turns into a potential maintenance issue where you're setting the same headers in multiple .headers files. If we were to go the other way and allow files up the chain to append, researching where a header came from would be difficult - and that assumes we knew which could be appended vs. which must be a single value.

If I'm understanding at least one of your concerns correctly, it's the need for a restart on changes. I think if I added a file watcher to the global header file so that it got reloaded when updated, it would address that concern. If you were to add new files to the site, you would either be okay with the existing rules and no header changes were necessary, or you would add headers to the global - at which point a reload would occur.
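
A rough sketch of what reloading on change could look like, using Node's built-in `fs.watch`. The name `createHeaderStore` and the returned object shape are hypothetical, and the file is assumed to contain plain JSON; nothing here is taken from the PR's actual code.

```javascript
const fs = require('fs');

// Hypothetical holder for the global header rules.
function createHeaderStore(file) {
  let rules = {};

  const reload = () => {
    try {
      rules = JSON.parse(fs.readFileSync(file, 'utf8'));
    } catch (err) {
      // Keep the last good rules if the file is missing or half-written.
    }
  };

  reload(); // initial read at startup

  return {
    get: () => rules,
    reload,
    // fs.watch may fire more than once per save; reload is safe to repeat.
    // Returns an fs.FSWatcher; call .close() on it to stop watching.
    watch: () => fs.watch(file, reload),
  };
}
```

Swallowing parse errors means a bad edit leaves the previous rules in place rather than taking the server down; whether that is the right trade-off is a design choice this sketch does not settle.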

Having said all that, I really think we should support both (in-place and global), but I would propose the following:

  • In-place only applies to the directory it's in, not child directories. There is no inheritance.
  • Once a header is set, it is not overridden by the global.
  • The global will append headers if the append setting is on.
  • Directory .headers are checked on each request.
  • Global header is updated on startup or based on a file watcher.

Taking the combined approach:

  • Allows you to be very specific when necessary and not pollute the global with a bunch of one-offs.
  • It still allows the global file to prevent additional maintenance (e.g., a headers file in every directory just for .js).
  • Simplifies trouble-shooting. The header is either coming from the directory or from the global.
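
One possible reading of the precedence rules proposed above, as a small sketch. The function name `resolveHeaders` and the `append` option are hypothetical, and header names are assumed to be already normalized to the same casing.

```javascript
// Merge per-directory headers with global headers:
// directory values are never overridden; the global only fills gaps,
// or appends to an existing value when `append` is on.
function resolveHeaders(directoryHeaders, globalHeaders, { append = false } = {}) {
  const result = { ...directoryHeaders };
  for (const [name, value] of Object.entries(globalHeaders)) {
    if (!(name in result)) {
      result[name] = value; // not set by the directory: global applies
    } else if (append) {
      result[name] = `${result[name]}, ${value}`; // append, never replace
    }
    // Otherwise the directory value stays untouched.
  }
  return result;
}
```

Because there are only two sources, any header in the result can be traced to either the directory file or the global file.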

What do you think?

@BigBlueHat
Member

@jimlawhorn thanks for reasoning through all that! Inheritance wasn't on my personal list of "wants" for this feature. The whole idea came from the W3C's Web Platform Tests server--used to test all the browsers. They have several different "handlers", one of which is the File Handler:
https://wptserve.readthedocs.io/en/latest/handlers.html#file-handlers

That describes what I'd personally like to see. The "most" it does is check three files in total at request time (so no "global" and no inheritance):

/example-folder/
 - __dir__.headers
 - index.html.headers
 - index.html

A request for /example-folder/ or /example-folder/index.html or /example-folder/index would result in all three of those files being read. However, they're minimal reads with no parsing as the .headers files are "raw" HTTP headers. So, it's essentially a raw concatenation of __dir__.headers and index.html.headers + the response body being index.html (also unparsed and unprocessed). Additionally, the __dir__.headers file only affects its sibling files--i.e. it doesn't "cascade" into sub-directories (so no inheritance in that sense).
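
A minimal sketch of that request-time lookup in Node, assuming each .headers file holds one raw `Name: value` line per header. The helper names `readIfExists` and `rawHeadersFor` are hypothetical, and this is not how wptserve itself is implemented.

```javascript
const fs = require('fs');
const path = require('path');

// Read a file as text, treating a missing file as empty.
function readIfExists(file) {
  try {
    return fs.readFileSync(file, 'utf8');
  } catch (err) {
    if (err.code === 'ENOENT') return '';
    throw err;
  }
}

// Concatenate the sibling __dir__.headers and <file>.headers, in that order.
// Only the file's own directory is consulted; parent directories are ignored.
function rawHeadersFor(filePath) {
  const dirHeaders = readIfExists(path.join(path.dirname(filePath), '__dir__.headers'));
  const fileHeaders = readIfExists(filePath + '.headers');
  return [dirHeaders, fileHeaders]
    .flatMap((text) => text.split(/\r?\n/))
    .filter((line) => line.trim() !== '');
}
```

The returned lines are left unparsed, matching the "raw concatenation" idea: a server could write them to the response as-is before streaming the body.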

What this allows is for multiple directories from multiple sources to be tossed into one web space and served without a "global" file being consulted and/or changed to deal with the new scenarios. Each directory is "self sovereign" in that sense.

It also avoids additional file monitoring, reloading, etc as everything is read at request time.

It's sort of a Principle of Least Astonishment thing. 😃 It means you can analyze each request's potential response output simply by reading these 1-3 raw sibling files--vs. consulting and parsing (mentally or otherwise) a file elsewhere in the hierarchy.

It is certainly a more limited approach, but it also keeps http-server dumb by design. 😄 Something I find valuable for longevity and stability.

Thoughts?

@jimlawhorn
Author

I would argue that the Principle of Least Astonishment would indicate that we should follow what other popular web servers are doing. Taking cues from Apache HTTP Server, Nginx, Netlify, etc. would lead to a single configuration file at the root (or virtual host) that is only read at server start.

While some implementations do allow for directory-specific configurations (like Apache's .htaccess), it is generally not advised. Nginx goes so far as to say it's wrong.

I think there are use cases for both, which is what I'm advocating. I only have use cases for the dynamic headers at the root level. I'm happy to also implement the directory/file-level headers that you have described, but implementing only directory/file-level headers is not something that I'm interested in contributing.

Thoughts?

@thornjad added this to the Custom headers milestone Aug 16, 2021
@thornjad
Member

Sorry to jump in after so much time, I agree that a directory-level configuration is best long-term. However, least astonishment for me makes me want to call the file .headers. The dot ensures it's usually hidden, because it's not really a "real" file to be served.

I also like reading on startup, rather than on request.

I'm unsure about the JSON format and whether it introduces more complexity than a "simple http server" should offer? I'm not entirely convinced in any direction.

In the case we do want the added complexity, what would you think about a more powerful and less verbose format like YAML or something else?

@thornjad removed this from the Custom headers milestone Aug 16, 2021
@thornjad
Member

Due to general inactivity and my comments above, I'm going to close this in favor of the older #282.

@thornjad closed this Oct 12, 2021